New High Performance GPGPU Code Transformation Framework Applied to Large Production Weather Prediction Code
نویسندگان
چکیده
We introduce “Hybrid Fortran”, a new approach that allows a high performance GPGPU port for structured grid Fortran codes. is technique only requires minimal changes for a CPU targeted codebase, which is a signicant advancement in terms of productivity. It has been successfully applied to both dynamical core and physical processes of ASUCA, a Japanese mesoscale weather prediction model with more than 150k lines of code. By means of a minimal weather application that resembles ASUCA’s code structure, Hybrid Fortran is compared to both a performance model as well as today’s commonly used method, OpenACC. As a result, the Hybrid Fortran implementation is shown to deliver the same or beer performance than OpenACC and its performance agrees with the model both on CPU and GPU. In a full scale production run, using an ASUCA grid with 1581 x 1301 x 58 cells and real world weather data in 2km resolution, 24 NVIDIA Tesla P100 running the Hybrid Fortran based GPU port are shown to replace more than 50 18-core Intel Xeon Broadwell E5-2695 v4 running the reference implementation an achievement comparable to more invasive GPGPU rewrites of other weather models.
منابع مشابه
Hybrid Fortran: High Productivity GPU Porting Framework Applied to Japanese Weather Prediction Model
In this work we use the GPU porting task for the operative Japanese weather prediction model “ASUCA” as an opportunity to examine productivity issues with OpenACC when applied to structured grid problems. We then propose “Hybrid Fortran”, an approach that combines the advantages of directive based methods (no rewrite of existing code necessary) with that of stencil DSLs (memory layout is abstra...
متن کاملDeveloping a High Performance Gpgpu Compiler Using Cetus
In this paper we present our experience in developing an optimizing compiler for general purpose computation on graphics processing units (GPGPU) based on the Cetus compiler framework. The input to our compiler is a naïve GPU kernel procedure, which is functionally correct but without any consideration for performance optimization. Our compiler applies a set of optimization techniques to the na...
متن کاملAn efficient secure channel coding scheme based on polar codes
In this paper, we propose a new framework for joint encryption encoding scheme based on polar codes, namely efficient and secure joint secret key encryption channel coding scheme. The issue of using new coding structure, i.e. polar codes in Rao-Nam (RN) like schemes is addressed. Cryptanalysis methods show that the proposed scheme has an acceptable level of security with a relatively smaller ke...
متن کاملIntegrated Frameworks for Earth and Space Weather Simulation
Simulations of Earth and space weather require the representation and coupling of distinct physical domains in a flexible, computationally efficient manner. There is increasing call to interface Earth and space models, as the interplay of phenomena between these domains is an active topic of basic research, and a key factor in operational prediction systems. Software frameworks have been develo...
متن کاملBoosting Java Performance Using GPGPUs
Heterogeneous programming has started becoming the norm in order to achieve better performance by running portions of code on the most appropriate hardware resource. Currently, significant engineering efforts are undertaken in order to enable existing programming languages to perform heterogeneous execution mainly on GPUs. In this paper we describe Jacc, an experimental framework which allows d...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1802.05839 شماره
صفحات -
تاریخ انتشار 2018